@bclenet
Contributor

@bclenet bclenet commented Nov 4, 2025

This example aims to show provenance for a study dataset.

The use case is manual brain segmentations performed by several experts on a single T1w file.

@bclenet bclenet changed the title [BEP028] provenance - study dataset [BEP028] provenance - study dataset + manual annotation Nov 4, 2025
@bclenet
Contributor Author

bclenet commented Nov 7, 2025

@cmaumet would you agree to completely remove the software provenance objects from this example?
I'm afraid they create an ambiguity: with the current version, we cannot describe whether the software was solely responsible for generating the data or whether a human intervened in the process.

@cmaumet

cmaumet commented Nov 13, 2025

> @cmaumet would you agree on completely removing the software provenance objects in this example ? I'm afraid it creates an ambiguity as with the current version we are not able to describe if the software was solely responsible for generating the data or if a human intervened in the process.

Yes @bclenet I concur 💯

@bclenet
Contributor Author

bclenet commented Nov 19, 2025

@cmaumet following our discussion yesterday, I created a provEntity for the source file in this example because it is not in the dataset where the provenance is described (the source file is in the sourcedata/raw dataset, while the provenance is in the derivatives/seg dataset). Nothing in the current version of the specification PR seems to encourage or discourage this. We only have:

> Each file with a ent suffix is a JSON file describing provEntities.
> These files SHOULD not contain provEntities describing data files that are available in the dataset. Use sidecar JSON files instead for this purpose (see Provenance of a BIDS file).

Which we could change to:

> These files SHOULD not contain provEntities describing data files available in BIDS datasets.

But in some (most?) cases we don't have access to sidecar JSON files for files outside the dataset. That leaves nowhere to store their Digest and Type metadata.

@cmaumet

cmaumet commented Nov 19, 2025

@bclenet - I think the hope was that we could declare the subdataset in dataset links and use an identifier based on that link (which would make it possible to refer to BIDS files in other datasets). Does that make sense? If not, let's go through the example together!
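
The linking idea above could be sketched roughly as follows. This is only an illustration, assuming the derivatives dataset declares the raw dataset under a hypothetical name `raw` in the `DatasetLinks` field of its `dataset_description.json`. The link target, the file path, and the helper `resolve_bids_uri` are made up for the example and are not part of this PR.

```python
# Hypothetical DatasetLinks mapping, as it might appear in the
# dataset_description.json of the derivatives/seg dataset (assumption).
DATASET_LINKS = {"raw": "../../sourcedata/raw"}


def resolve_bids_uri(uri, dataset_links):
    """Split a BIDS URI of the form bids:<dataset-name>:<relative-path>.

    Returns a (dataset location, relative path) tuple. An empty dataset
    name refers to the current dataset, represented here as ".".
    """
    scheme, name, rel_path = uri.split(":", 2)
    if scheme != "bids":
        raise ValueError(f"not a BIDS URI: {uri!r}")
    if name == "":
        return ".", rel_path
    if name not in dataset_links:
        raise KeyError(f"dataset {name!r} is not declared in DatasetLinks")
    return dataset_links[name], rel_path


# Illustrative file name only; the actual example files may differ.
location, path = resolve_bids_uri(
    "bids:raw:sub-01/anat/sub-01_T1w.nii.gz", DATASET_LINKS
)
print(location, path)
```

The identifier stays a plain string, so it only tells a reader where the file lives; it carries no metadata about the file itself.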

@bclenet
Contributor Author

bclenet commented Nov 19, 2025

@cmaumet yes, we can refer to a BIDS file in another dataset using its BIDS URI only. But we cannot provide further provenance metadata for this file unless we create a provEntity in a provenance file for it.
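
To make the limitation concrete, here is a rough sketch of what a provEntity record for the external source file might hold beyond the bare BIDS URI. Only `Digest` and `Type` are named in this thread; the `Id` key, the shape of the `Digest` value, the `Type` value, and the file name are assumptions for illustration and are not taken from the specification PR.

```python
import hashlib
import json


def make_prov_entity(bids_uri, content, entity_type):
    """Build a provEntity-like record for a file referenced by BIDS URI.

    The key names and the Digest layout are assumptions made for this
    sketch, not the layout defined by the specification.
    """
    digest = hashlib.sha256(content).hexdigest()
    return {
        "Id": bids_uri,
        "Type": entity_type,
        "Digest": {"sha256": digest},
    }


# Placeholder bytes stand in for the content of the T1w file.
entity = make_prov_entity(
    "bids:raw:sub-01/anat/sub-01_T1w.nii.gz",
    b"placeholder bytes standing in for the T1w file",
    "hypothetical-type",
)
print(json.dumps(entity, indent=2))
```

The URI alone would correspond to the `Id` value here; the other two fields are exactly the metadata that has no home without a provEntity or a sidecar JSON.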
